perm filename BYMRG.MSS[RDG,DBL] blob sn#693386 filedate 1983-01-05 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00009 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	@Make[Report]
C00006 00003	@Chapter[What does it mean to "Understand" an Analogy]
C00009 00004	@SubHeading[Informal Specification for Analogy]
C00011 00005	@SubHeading["New" Facts]
C00019 00006	@SubHeading["Analogical" Facts]
C00025 00007	@SubHeading[Final Comments ]
C00030 00008	@Chapter[Notes]
C00040 00009	Acknowledgements
C00044 ENDMK
CāŠ—;
@Make[Report]
@Device[DOVER]
@LIBRARYFILE<SPECIALCHARACTERS>

@Comment< @Case{Device,	FILE "@Style(Endnotes)",
		PAGEDFILE "@Style(Endnotes)"} >

@Modify[Verbatim, Break Before]
@Modify[Description, Spread 0, Spacing 1.3, LeftMargin +10, Indent -10]
@Modify[Quotation, Indent 0]
@Modify[Itemize, Referenced <@#>]
@Modify[Enumerate, Referenced <@#@;#@:.@1@,@#.@a@,@#.@i>]
@Modify[Appendix, Numbered <@A.>, Referenced <@A>]
@Modify[Equation, FaceCode T]
@Comment{ Brian said this would work... Ha!
 @Modify[NoteCounter, Referenced <@#>] }
@Style[References=STD Alphabetic]
@DEFINE[Aside=NoteStyle, LeftMargin -16, Indent 0]
@DEFINE[SubAside=Quotation,Font SmallBodyFont,FaceCode I,Spacing 1,Spread 0.5]
@DEFINE[Subsubsection,Use HdX,Font TitleFont3,FaceCode I,Above .3inch,Centered]
@DEFINE[ENUM0=ENUMERATE, NumberFrom 0, SPREAD=0, Above=0.3 line, Below=0.1 line]
@DEFINE[ENUM1=ENUMERATE, SPREAD=0.1, Spacing=1, Above=0.3 line, Below=0.1 line]
@TEXTFORM[BEGENUM1 = 
	"@BEGIN<ENUMERATE, SPREAD=0.1, Spacing=1, Above=0.3 line, Below=0.1 line>"]
@TEXTFORM[ENDENUM1 = 
	"@END<ENUMERATE>"]
@DEFINE[ITEM1=Itemize, Spread=0.1, Spacing=1, Above=0.3 line, Below=0.1 line]
@DEFINE[DESC1=Description, Spread=0, Spacing=1, Above=0.3 line, Below=0.1 line]
@DEFINE[TEXT1 = TEXT, USE NoteStyle]
@Define[NoIndent,Continue Forced,Break Around]

@EQUATE[YY=R]

@Counter[EquationCounter, Numbered <(Facts @1)>, Referenced <(Facts @1)>,
IncrementedBy tag, INIT 0]

@Comment{ Now to keep from Breaking in front of chapters. }
@DEFINE(HdChap,Use HD1,PageBreak Off,Above .6inch)
@Counter(Chapter,TitleEnv HdChap,
	ContentsEnv tc1,Numbered [@1.],IncrementedBy Use,Referenced [@1],Announced) 
@Comment{How annoying!   Without this, the section numbers don't get reset!
	Note this is exactly what's in REPORT.MAK[scr,sys]. }
@Counter(Section,Within Chapter,TitleEnv HD2,ContentsEnv tc2,
	  Numbered [@#@:.@1],Referenced [@#@:.@1],IncrementedBy Use,Announced)

@Use(Bibliography =  "GENL.BIB[RDG,DBL]")
@Use(Bibliography =  "REPN.BIB[RDG,DBL]")
@Use(Bibliography = "META4.BIB[RDG,DBL]")


@Define[F1,FaceCode B,TabExport] 
@COMMENT<
@Specialfont(F1="UN510E")
This is from the "Univers" family.
   @Define[Fancy,Font = F1,FaceCode B,TabExport] 
 Later make this better! >
@Chapter[What does it mean to "Understand" an Analogy]

Before discussing how to communicate an analogy,
let us consider what it means to "understand" an analogy --
that is, how will one's corpus of facts will be alterred
as an analogy is ingested.
(We will later deal with mechanism by which this message (signal?)
can be conveyed.)

We view "analogy understanding" as a type of learning,
in that the hearer now knows something new.
@****** MRG: Must this be the case, or could it be otherwise?
Would it count if the only new fact is this analogical connection?*****@*
Furthermore, that addition fact must be analogical,
another concept which remains to be defined.
We will address these two points in turn --
discussing first what it means to learn something new,
and then what it means for a fact to be considered analogical.

@SubHeading[Informal Specification for Analogy]

First, some informal specifications
@Comment[? simplying assumptions ?]
to define 
@Comment[? constrain ?]
the sort of analogies we will (and will not) be considering.
@BEGIN[Itemize]
An analogy connects a pair of things --
which may be objects, ideas, relations, or what-have-you.

The purpose of an analogy is to enhance the hearer's understanding
of one of the analogues, by virtue of some known features of the others.
As our vocabulary implies, we are taking the model of analogy
as an communication act, where the speaker is trying to convey a
body of facts about an object, A, to a hearer; and does so by describing
this A is being like B, in some manner C.@Foot{
We will later discuss various enhancements which spring from this theme:
when both the speaker and the hearer are the same individual;
times when that "manner C" can be left implicit (not explicitly stated);
the relevnce of the speaker's model of hearer, and vice versa;
etc.}
See also Note @Ref[NonAnal].
@END[Itemize]

@SubHeading["New" Facts]
What does it mean for some fact to be "new"?
Let us call the hearer's initial body of knowledge, @T[Th].
(In this situation, these are the facts he knows before hearing the analogy.)
What must be true of a statement @g[s], for it to be new, wrt @T[Th]?
By cases,
@BEGIN[Enumerate]
@Tag[alreadyin]
@g[s] can already be in @T[Th] -- @g[s] @K[MemberOf] @T[Th].

@Tag[deduce]
@g[s] may be deducable from @T[Th] -- @g[s] @K[MemberOf] @T[DC(Th)],@*
where @T[DC] takes a theory onto its deductive closure.@Foot{
To make this split a true partitioning, we should eliminate the deriving axioms
from the closure...}

@Tag[incons]
@g[s] can be inconsistent with @T[Th] -- @K[Not]@g[s] @K[MemberOf] @T[DC(Th)]

@Tag[indep]
Independence (or "none of the above") --
@g[s], @K[Not]@g[s] @K[NotMemberOf] @T[DC(Th)].
@END[Enumerate]

Under which of the above case(s) can a fact be considered new?
Certainly not case @Ref[alreadyin].
As we are, for now, considering a theory to be deductively closed,
case @Ref[deduce] degrades to case @Ref[alreadyin], and
should, for the same reason, be eliminated.
@******MRG: I'd still like to examine this position more carefully.*****@*
We also do not want to consider inconsistent theories;
which would happen in case @Ref[incons]
after we (pursue our goal of) adding @g[s] to @T[Th].
(See Note @Ref[SemClos].)
This eliminates that case as well,
leaving only the case @Ref[indep].

While this constraint is necessary, it is far from sufficient.
Nothing so far prevents us from adding in a random sentence,
composed of new symbols which are unrelated to anything in the
current theory.
Such weak statements seem to convey no real meaning.@Foot{
Consider the analogy statement itself: @T[Analogous(A B ...)].
While this statement may be independent of the existent @T[Th],
it may not actually communicate anything new; 
until and unless it is somehow connected with the other facts in @T[Th].}
To formalize this intuition
we need to consider the model theoretic notion
of possible interpretations.

Given any theory @F1[T], there are a set of interpretations,
@T<@F1[I]@-{T} @K[Eq] @K[LeftBrace] I@-{j} @K[RightBrace]>,
where each @T<I@-{j}> maps each symbol of 
@F1[L], the language of @F1[T], onto objects or sets of tuples of objects,
in the real world.
(I.e. each constant is mapped onto an object, (in the real world,)
and each n-ary relation onto the its characteristic set --
those n-tuples which satisfy it.)

Given any symbol @T[c],
we can define the image of that symbol under an interpretation,
@T<I@-{j}[c]> as that thing -- be it object, or set of tuples.
We can similarly define the ?image? of the set,
@T<@F1{I}[c] @K[Eq] 
@K[LeftBrace] I@-{j}[c] @K[Modulo] I@-{j} @K[MemberOf] @F1[I] @K[RightBrace]>.

We present some important facts before proceeding:
@BEGIN[Enumerate]
@BEGIN[Multiple]
The "range" of each instantiation,
(a.k.a. a model of @T<Th>,)
will include objects in the "real world",
as opposed to vague, theoretic symbols.
Hence elements of the one model may overlap with those of another.
In particular,
@T<I@-[i][A] @K[Eq] I@-[j][A]>, where @T<i @K[Neq] j>;
so the number of elements of @T<@F1[I][A]> may be fewer than
the size of @T<@F1[I]>.

It is also possible that different symbols, in different instantiations,
may refer to the same real world object.
@****** Is this important? *****@*
For example, the symbols @T<MRG> and @T<MRG'> may refer to the real world objects
(whose canonical referent are) 
@T<Prof Micheal R. Genesereth> and @T<Dr Milton R. Grinberg>, respectively,
in one model,
but oppositely in another.
@END[Multiple]

@T<@F1{I}@-[T'][c] @K[SubSet] @F1{I}@-[T][c]>, where
@T<T @K[SubSet] T'>,
as additional facts can only further restrict the range of
possible interpretations for any symbol.
(As interpretation is not well defined for inconsistent theories,
of course we insist that @T<T> and @T<T'> are each consistent.)
@END[Enumerate]

We can use the second observation to address the question of what it means to 
"learn" something new:
We will claim that some statement, @g[s], contains some information
wrt a theory @T<T> and a symbol @T<c>, 
@T<New( T, c, @g[s] )>,
@BEGIN[Equation]@T<
New( T, c, @g[s] ) @K[IFF] @F1{I}@-[T'][c] @K[ProperSubSet] @F1{I}@-[T][c],
>@END[Equation]
where @T<T' @K[Eq] T @K[Union] @K[LeftBrace] @g[s] @K[RightBrace]>.
@****** Should I state explicitly that @K[ProperSubSet] is ProperSubSet? *****@*

This criterion is non-trivial
 -- there are many independent sentences,
@T<@g[f][A]>, ones which deal with @T[A],
which do NOT limit the number of instantiations of @T[A].
Note @Ref[NonLim] describes several such sentences.

As the goal of understanding an analogy is to learn something new about
one of the analogues, @T[A], we will require that the analogy
convey some proposition @g[s] such that @T<New( Th, A, @g[s] )>.

@SubHeading["Analogical" Facts]

Now that we know what it means to be "new" in general,
we can ask what it means to be new in an analogical sense.
(That is, everything above dealt with what it means for a fact to be
considered new, wrt a theory, and a symbol.
It is, as such,
applicable to any learning experience, not just those related to analogies.)
This subsection will discuss what it means to be an analogy,
within the context defined above,
and based on the notion of learning as articulated above.

First, what does it mean for two things to be analogous?
We intuitively think this means they are, in some manner, similar.
This intuition is trivial to formalize:
We consider @T[A] and @T[B]
@***** MRG - should I use the same symbols (fonts) for the symbols
as for their referents?  *****@*
to be analogous if some assertion is true of both of them --
that is, if there is some sentence, @g[f], such that both
@T<@g[f][A]> and @T<@g[f][B]> are true.

Recall now that the purpose of analogy is educational
-- that is, our mission is to find some @g[f] such that
@T<New( Th, A, @g[f][A] )>.
Is it enough to find some sentence @g[f] such that
@T<@g[f][B]> is in @T[Th], and
@T<@g[f][A]> is independent of @T[Th] -- i.e.
@BEGIN[Equation]@T<
@Tag[Defn-f]
@g[f][B] @K[MemberOf] @T[Th] @K[And] @g[f][A] @K[NotMemberOf] @T[Th] @K[And] @~
@K[Not]@g[f][A] @K[NotMemberOf] @T[Th]?
>@END[Equation]

The answer, it turns out, is yes.
We have to prove, first, that this corresponds to an analogy connection,
and second that this would be a new fact about @T[A].
The first is trivial:
@g(f) is the property which both @T[A] and @T[B] share.
The second part is more difficult.
@****** Actually, it's not even true!  Redo everything below! *****@*

By assumption @T<@g[f][A]> is independent of @T[Th];
it remains only to show that
@T<@F1{I}@-[Th'][A] @K[ProperSubSet] @F1{I}@-[Th][A]>, where
@T<Th' @K[Eq] T @K[Union] @K[LeftBrace] @g[f][A] @K[RightBrace]>.

The easiest way of showing this is by constructing the model of @T<Th'>
which has an interpretation of @T<A> which cannot be amoung 
@T<@F1{I}@-[Th'][A]>.
We simply construct a Henken model of @T<Th'>,
(technically the model corresponding that theory...,)
and complete it with @T<@K[Not]@g[f][A]>.
@****** MRG, RWW: Is this correct, and legal? *****@*
Clearly this interpretation of @T<A> will not be among the
interpretations of @T<Th'>, @T<@F1{I}@-[Th']>, QED.

@SubHeading[Final Comments ]
We present some closing comments
before leaving this topic
(to address the issue of what goes into the statement of an analogy).
many of which will motivate ...

@BEGIN[Enumerate]
As should be obvious,
we are attempting to be descriptive here, as opposed to prescriptive.
That is, we have yet to say anything about how to generate such a @g[f],
given @T[Th], @T[B] and @T[A].
Some hints 
(relating to things like the speaker's knowledge, and hearer's model thereof;
and truth/validity/...)
will be given later in this paper;
a fully descriptive procedure is for a different paper.

A related point is that this description 
only addresses what constitutes a legal analogy,
but says nothing about what it means to be a good (or even acceptable) analogy.
Again the next section will hint at some comparative criteria.

Everything above will still work, even if we do not start
with an explicit symbol for the analogue in question --
it might be some implicit ... (e.g. the value of a function, or ...)
It just gets trickier -- having first to reify ...
There are many ways around this issue; but it has to be mentioned.
@****** MRG - That still has to be shown *****@*

@BEGIN[Multiple]
We have intentionally skirted the issue of efficiency,
and assumed that arbitrarily competent theorem provers are at work.
In any real application we would be dealing with resource limited processors.
Here one purpose of an analogy statement might be to direct the hearer
to focus on some collection of facts he already knew,
rather to expose him to new facts.

Again, that explantion will be the fodder for a different paper.
@END[Multiple]

@Chapter[Notes]

@BEGIN[Multiple]
@Tag[SemClos]
We have to be very careful here.
Realize that the conditions
@BEGIN[Equation]@T<
@TAG[Independent]
@K[Not] T @K[RightTurnstile] @g[s] @K[And]
@K[Not] T @K[RightTurnstile] @K[Not]@g[s]
>@END[Equation]
do NOT guarantee that
@T<T' @K[Eq] T @K[Union] @K[LeftBrace] @g[s] @K[RightBrace]> will be consistent.
Let @g(s) be Godel's famous "I am unprovable" sentence and
@T<T @K[Eq] "the theory of arithmetic">.
While @Ref[Independent] holds, 
adding @g[s] to @T<T> will mean there is a proof for that @g(s),
-- i.e. @T<T' @K[RightTurnstile] @g(s)> --
contradiction.
@****** Yoram - what do I mean here? *****@*
@END[Multiple]

@BEGIN[Multiple]
@Tag[NonLim]
There are a host of sentences which say nothing about a given constant,
wlog @T<A>.
First, obviously,
almost any sentence which does not contain the symbol @T<A> will
not contribute to @T<Th>'s knowledge about @T<A>.
(The above claim is clearly not universal;
a sentence which does not mention @T<A> could easily imply something 
limited the possible interpretations.
For example, the claim that
@T<@K[ForAll] x,y. (Father x y) @K[Implies] (Ancestor x y)>
does not include the symbol @T<Fred>, but it could eliminate certain possible
interpretations of that symbol, based on additional facts
(about @T<Fred>, @T<Father> and @T<Ancestor>) in @T<Th>.)

Is it possible to construct sentences which @i{do} contain the symbol @T<A>,
and which are independent of @T<Th>,
which nonetheless do not limit the number of interpretations of @T<A>?

The answer seems "no":
The statement that @T<@K[Not](Th @K[RightTurnstile] @g[f][A])>,
eliminates the cases when @T<@g[f][A]> is trivial (e.g. a tautology);
and it seems that any statement that non-trivially involves @T<A>
must say something new about @T<A>.

The argument goes as follows:
Asserting @T<@g[f][A]> would not eliminate any member of
@T<@F1[I]@-[Th][A]> if it
"just happened" to be true for all those members.
That would mean that @T<@g[f][A]> is true in all models of @T<Th>;
@T<Th @K[DoubleTurnstile] @g[f][A]>.
But, by Godel's Completeness Theorem, this means that
@T<Th @K[RightTurnstile] @g[f][A]>, as
@T<(Th @K[DoubleTurnstile] @g[f][A]) @K[Implies] (Th @K[RightTurnstile] @g[f][A])>.
But this directly contradicts the initial assumption that
@T<@K[Not](Th @K[RightTurnstile] @g[f][A])>.

Hence there must be some interpretation, @T<I@-[j]>,
in which this @T<@g[f][A]> is false;
and this @T<I@-[j][A]> must therefore be thrown out of
@T<@F1[I]@-[Th'][A]>, proving that
@T<@F1[I]@-[Th'][A] @K[Neq] @F1[I]@-[Th][A]>, as desired.

There is one important flaw with this argument:
All we've really proven is that @T<@F1[I]@-[Th'] @K[ProperSubset] @F1[I]@-[Th]>,
which need not imply that
@T<@F1[I]@-[Th'][A] @K[ProperSubset] @F1[I]@-[Th][A]>.
There may, for example, be another interpretation, @T<I@-[k]>,
which remained in @T<@F1[I]@-[Th']>, and
which mapped @T<A> onto this same element --
i.e. @T<I@-[j][A] @K[Eq] I@-[k][A]>, meaning that
@T<@F1[I]@-[Th'][A]> might still equal @T<@F1[I]@-[Th][A]>.

For example, suppose we had
@BEGIN[Equation]
Th @K[Eq] @~
@K[LeftBrace] [A @K[MemberOf] @K[LeftBrace] 0,1 @K[RightBrace]], @~
[C @K[MemberOf] @K[LeftBrace] 0,1 @K[RightBrace]] @K[RightBrace]
@END[Equation]
There are four obvious interpretations,
depending on the values of @T<A> and @T<C> --
@BEGIN[Verbatim]
		A	C
   ----------------------
   I@-[0]	0	0
   I@-[1]	0	1
   I@-[2]	1	0
   I@-[3]	1	1
@END[Verbatim]
Hence @T<@F1[I]@-[Th][A] @K[Eq] @K[LeftBrace] 0,1 @K[RightBrace]>.
Now we add in the independent statement
@BEGIN[Equation]@T<
@g[f][A] @K[Equivalent]   A @K[Eq] C
>@END[Equation]
to form @T<Th'>.
While this only leaves two of the four interpretations, 
@T<I@-[0]> and @T<I@-[3]>, 
@T<@F1[I]@-[Th'][A]> remains @T<@K[LeftBrace] 0,1 @K[RightBrace]>;
indicating that @T<@g[f][A]> said nothing new about @T<A>.

Another method of generating sentences which
are independent of @T<Th> and involve @T<A>,
but which add no new information about @T<A>,
involves extending @F1[L], the language of @T<T>.
Take for example, the sentence @T<Floogle(A)>,
where @T<Floogle> is not in @F1[L].
(The only essential feature is that nothing in @T<Th> relates to
this @T<Floogle> relation.
Otherwise, we would need to resort to some type of extensible signature table,
and other hairy messes.)
On introducing this relation, we now have a factor of @T<2@+[n]> more models
of @T<T>, where @T<n> is the number of constants (? constructable elements ?) --
depending on whether @T<Floogle(x)> is true for that element @T<x>, or not.

While the number of interpretations, @T<@F1[I]@-[Th']>, will clearly increase,
the number of instantiations of @T<A>, <@F1[I]@-[Th'][A]> will not.
There is nothing in @T<A>'s newly noted @T<Floogle>@i<hood> which can
eliminate any members of that set of interpretations.
Hence we claim that
@T<@K[Not](New Th, A, Floogle(A) )>;
i.e. the sentence @T<Floogle(A)>,
which clearly deals with @T<A>,
says nothing new about @T<A>.
(Realize that this was the case above when we added in @T<Analogous( A, B, ... )>,
when we had no rules which joined this @T<Analogous> relation (symbol?)
to anything then in @T<Th>.)

That is an advantage of knowing that @T<Th @K[Turnstile] @g[f][B]> --
clearly all the symbols of @g(f) must be in this @T<@F1[L]@-[Th]>.
@END[Multiple]

@BEGIN[Multiple]
@Tag[NonAnal]
It is worth mentioning various cases of NON-analogy:
(If we allowed these to count as analogy, the above premises would not
be sufficient.)

@BEGIN[Itemize]
It might be to tell the hearer that you, the speaker, happen to
also know this something.

It might be the basis of induction --
from
@BEGIN[Equation]
MRG went to good school,
& RDG is like MRG (in terms of education)
@END[Equation]
deduce 
@BEGIN[Equation]
all stanford CS people well educated.
@END[Equation]
This is induction, NOT analogy.
@END[Itemize]
@END[Multiple]

Yoram said I had to insist that the theory
includes some properties of the universe (the range of each
interpretation) -- in particular, its size, or at least, that it is
infinite (larger than the language -- so can add a witness, or 	
whatever...)
I still don't understand why...

@END[Enumerate]
Acknowledgements
To DBL,
BCM,
Yoram Moses - for helping me sort out what I should have meant while
floundering in model theory, ...